Federal University of Sao Paulo (UNIFESP), DIS-Bioinformatics
Motivation: Here we propose a method to discriminate closely related species at the molecular level using entropy and mutual information. Sequences of orthologous genes from the same species may contain species-specific covariation patterns that can be identified through mutual information.
Summary: Mutual Information Analyzer (MIA) is a pipeline written in Python to calculate vertical entropy (VH) and vertical and horizontal mutual information (VMI and HMI). From the VH, VMI and HMI distributions, the Jensen-Shannon divergence (JSD) is calculated to estimate the distances between species sequences. Pairwise distances between mutual information distributions, with their respective standard errors, are stored in distance matrices and can be presented as histograms or hierarchical cluster dendrograms.
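As a minimal sketch of the two core quantities, the vertical entropy of an alignment column and the vertical mutual information between two columns can be computed as below. The toy alignment and function names are illustrative assumptions, not MIA's actual API; MIA's exact estimators (including bias corrections) may differ.

```python
import math
from collections import Counter

def vertical_entropy(column):
    """Shannon entropy (bits) of one alignment column (a string of symbols)."""
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def vertical_mutual_information(col_i, col_j):
    """Naive MI (bits) between two columns: H(i) + H(j) - H(i, j)."""
    joint = ["".join(pair) for pair in zip(col_i, col_j)]
    return vertical_entropy(col_i) + vertical_entropy(col_j) - vertical_entropy(joint)

# Toy alignment: rows are sequences, columns are positions.
alignment = ["ACGT", "ACGA", "AGGT", "AGGA"]
columns = ["".join(col) for col in zip(*alignment)]
```

A fully conserved column (e.g. `"AAAA"`) has zero entropy, and a column paired with itself yields MI equal to its own entropy.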
Mutual Information Analyzer (MIA) is a pipeline written in Python. HMI and VMI are calculated with and without bias corrections; therefore, the gain or loss of information for “mincut” versus “maxmer”, with or without bias correction, can be compared. Distances between distributions are calculated as the square root of the JSD. Since mutual information and the JSD are not linear functions of the data, their standard errors are calculated by empirical error propagation. The pipeline produces the following outputs:
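A minimal sketch of the sqrt-JSD distance between two discrete distributions is shown below; the bias corrections and the empirical error propagation mentioned above are omitted, and the binning of the MI distributions into probability vectors is assumed to have been done already.

```python
import math

def _kl(p, q):
    """Kullback-Leibler divergence (bits) for aligned probability vectors."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    """Jensen-Shannon divergence: mean KL of p and q to their midpoint m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * _kl(p, m) + 0.5 * _kl(q, m)

def jsd_distance(p, q):
    """Square root of the JSD, which is a proper metric on distributions."""
    return math.sqrt(jsd(p, q))
```

With base-2 logarithms the distance is bounded in [0, 1]: identical distributions give 0, disjoint ones give 1.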
Vertical Shannon Entropy
Vertical Mutual Information 2D Heatmap
Vertical Mutual Information 3D Heatmap
Horizontal Entropy
JSD Histogram
Hierarchical Cluster
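The distance matrices above can be fed into standard clustering tools to build the dendrogram. A minimal sketch using SciPy follows; the species labels and the distance values are illustrative assumptions, not MIA output, and the choice of average linkage is one option among several.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Hypothetical species labels and a symmetric sqrt-JSD distance matrix.
labels = ["sp1", "sp2", "sp3", "sp4"]
dist = np.array([
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.7, 0.8],
    [0.8, 0.7, 0.0, 0.2],
    [0.9, 0.8, 0.2, 0.0],
])

# Condense the square matrix, then cluster with average linkage (UPGMA-like).
Z = linkage(squareform(dist), method="average")
clusters = fcluster(Z, t=2, criterion="maxclust")
```

`Z` can be passed to `scipy.cluster.hierarchy.dendrogram` with `labels=labels` to draw the hierarchical cluster figure.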